Abstract: $\ell_1$ minimization can be used to recover sufficiently
sparse unknown signals from compressed linear measurements.
In fact, exact thresholds on the sparsity (the size of the support
set), under which with high probability a sparse signal can
be recovered from i.i.d. Gaussian measurements, have been
computed and are referred to as "weak thresholds" [4]. It was
also known that there is a tradeoff between the sparsity and the
$\ell_1$ minimization recovery stability. In this paper, we give a
closed-form characterization for this tradeoff, which we call the
scaling law for compressive sensing recovery stability. In a nutshell,
we show that as the sparsity backs off by a fraction $\varpi$
($0 < \varpi < 1$) from the weak threshold of $\ell_1$ recovery, the
parameter for the recovery stability will scale as
$\frac{1}{\sqrt{1-\varpi}}$. Our result is based on
a careful analysis through the Grassmann angle framework for
the Gaussian measurement matrix. We will further discuss how
this scaling law helps in analyzing the iterative reweighted $\ell_1$
minimization algorithms. If the nonzero elements over the signal
support follow an amplitude probability density function (pdf)
$f(\cdot)$ whose $t$-th derivative $f^{(t)}(0) \neq 0$ for some integer $t \ge 0$,
then a certain iterative reweighted $\ell_1$ minimization algorithm
can be analytically shown to lift the phase transition thresholds
(weak thresholds) of the plain $\ell_1$ minimization algorithm.

I. Introduction 

Compressive sensing addresses the problem of recovering
sparse signals from under-determined systems of linear
equations [18]. In particular, if $x$ is an $n \times 1$ real-valued vector
that is known to have at most $k$ nonzero elements where $k < n$,
and $A$ is an $m \times n$ measurement matrix with $k < m < n$, then
for appropriate values of $k$, $m$ and $n$, it is possible to efficiently
recover x from y = Ax (TJ, [0, J3|, 0. The most well 
recognized powerful recovery algorithm is $\ell_1$ minimization,
which can be formulated as follows:

$$\min_{Az = Ax} \|z\|_1 \tag{1}$$
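
One reason the $\ell_1$ program can be solved efficiently is that it is a linear program in disguise. A standard reformulation is sketched here; the auxiliary vector $t \in \mathbb{R}^n$ is introduced only for illustration and is not part of the paper's notation:

```latex
\begin{aligned}
\min_{z,\,t \in \mathbb{R}^n} \quad & \sum_{i=1}^{n} t_i \\
\text{subject to} \quad & -t_i \le z_i \le t_i, \qquad i = 1, \dots, n, \\
& Az = Ax .
\end{aligned}
```

At any optimum $t_i = |z_i|$ for every $i$, so the optimal objective value equals $\|z\|_1$.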



The first result that established the fundamental phase transitions
of signal recovery using $\ell_1$ minimization is due to
Donoho and Tanner 0, H, where it was shown that if the 
measurement matrix is i.i.d. Gaussian, for a given ratio of
$\delta = \frac{m}{n}$, $\ell_1$ minimization can successfully recover every
$k$-sparse signal, provided that $\mu = \frac{k}{n}$ is smaller than a certain
threshold. This statement is true asymptotically as $n \to \infty$ and
with high probability. This threshold guarantees the recovery
of all sufficiently sparse signals and is therefore referred to
as a "strong" threshold. It does not depend on the
actual distribution of the nonzero entries of the sparse signal
and thus is a universal result.

Another notion introduced and computed in El. BTl is that 
of a weak threshold $\mu_W(\delta)$ under which signal recovery is
guaranteed for almost all support sets and almost all sign
patterns of the sparse signal, with high probability as $n \to \infty$.
The weak threshold is the one that can be observed in
simulations of $\ell_1$ minimization and allows for signal recovery
beyond the strong threshold. It is also universal in the sense
that it applies to any amplitude that the nonzero signal entries
take.

When the sparsity of the signal $x$ is larger than the
weak threshold $\mu_W(\delta)n$, a common stability result for
$\ell_1$ minimization is that, for a set $K \subseteq \{1, 2, \dots, n\}$ with
cardinality $|K|$ small enough for $A$ to satisfy the restricted
isometry condition [3] or the null space robustness property
[13], [14], the decoding error is bounded by

$$\|x - \hat{x}\|_1 \le D \|x_{\bar{K}}\|_1, \tag{2}$$

where $\hat{x}$ is any minimizer of $\ell_1$ minimization, $D$ is a constant,
$\bar{K}$ is the complement of the set $K$ and $x_{\bar{K}}$ is the part of $x$
over the set $\bar{K}$.

To date, known bounds on $|K|/n$, for the restricted isometry
condition to hold with overwhelming probability, are small
compared with the weak threshold $\mu_W(\delta)$ El- HI. [14] used
the Grassmann angle approach to characterize sharp bounds
on the stability of $\ell_1$ minimization and showed that, for an
arbitrarily small $\epsilon_0$, as long as $|K|/n = (1 - \epsilon_0)\mu_W(\delta)$,
with overwhelming probability as $n \to \infty$, (2) holds for some
constant $D$ ($D$ of course depends on $|K|/n$). However, no
closed-form formula for $D$ was given.

In this paper, we give a closed-form characterization for this
tradeoff, which we call the scaling law for compressive sensing
recovery stability. Namely, we give a closed-form bound
for $D$ as a function of $|K|/n$. It is the first result of this
kind. This result is obtained from a close analysis through the
Grassmann angle framework for the Gaussian measurement
matrix. We will further discuss how this scaling law helps in
analyzing the iterative reweighted $\ell_1$ minimization algorithm.

Using this scaling law for the stability and the
Grassmann angle framework for weighted $\ell_1$ minimization,
we prove that a certain iterative reweighted $\ell_1$ algorithm
indeed has better weak recovery guarantees for particular
classes of sparse signals, including sparse Gaussian signals.
We previously introduced these algorithms in [16], and had
proven that for a very restricted class of sparse signals they
outperform standard $\ell_1$ minimization. In this paper, we are able
to extend this result to a much wider and more reasonable
class of sparse signals. The key to our result is the fact
that for these signals, $\ell_1$ minimization has an approximate
support recovery property [17] which can be exploited by a
reweighted $\ell_1$ algorithm to obtain a provably superior weak
threshold. More specifically, if the nonzero elements over the
signal support follow a probability density function (pdf) $f(\cdot)$
whose $t$-th derivative $f^{(t)}(0) \neq 0$ for some $t \ge 0$, then a
certain iterative reweighted $\ell_1$ minimization algorithm can be
analytically shown to lift the phase transition thresholds (weak
thresholds) of the plain $\ell_1$ minimization algorithm by
using the scaling law for the sparse recovery stability. This
extends our earlier results of weak threshold improvements for
sparse vectors with nonzero elements following the Gaussian
distribution, whose pdf is itself nonzero at the origin (namely
its 0-th derivative is nonzero) [17].

It is worth noting that different variations of reweighted
$\ell_1$ algorithms have recently been introduced in the literature
and have shown experimental improvement over ordinary $\ell_1$
minimization [15], [7]. In [7], approximately sparse signals
have been considered, where perfect recovery is never possible.
However, it has been shown that the recovery noise
can be reduced using an iterative scheme. In [15], a similar
algorithm is suggested and is empirically shown to outperform
$\ell_1$ minimization for exactly sparse signals with non-flat
distributions. Unfortunately, [15] provides no theoretical
performance guarantee.

This paper is organized as follows. In Sections II and III
we introduce the basic concepts and system model. In Section
IV we introduce and derive the main result of this paper: the
scaling law for the compressive sensing recovery stability. In
the following sections, we use the scaling law to give new
analysis results about the iterative reweighted $\ell_1$ minimization
algorithms.

II. Basic Definitions 

A sparse signal with exactly $k$ nonzero entries is called
$k$-sparse. For a vector $x$, $\|x\|_1$ denotes the $\ell_1$ norm. The support
(set) of $x$, denoted by $\mathrm{supp}(x)$, is the index set of its nonzero
coordinates. For a vector $x$ that is not exactly $k$-sparse, we
define the $k$-support of $x$ to be the index set of the largest $k$
entries of $x$ in amplitude, and denote it by $\mathrm{supp}_k(x)$. For a
subset $K$ of the entries of $x$, $x_K$ means the vector formed by
those entries of $x$ indexed in $K$. Finally, $\max|x|$ and $\min|x|$
mean the absolute value of the maximum and minimum entry
of $x$ in magnitude, respectively.
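
The definitions above translate directly into a few lines of Python. The sketch below is only illustrative: the function names `support` and `k_support` are assumptions made here, indices are 0-based (the paper counts from 1), and ties in magnitude are broken by the lower index, a choice the paper does not specify.

```python
def support(x):
    """Index set of the nonzero coordinates of x (0-based indices)."""
    return {i for i, v in enumerate(x) if v != 0}


def k_support(x, k):
    """Index set of the k largest entries of x in magnitude.

    Ties are broken in favor of the lower index (an assumption of this sketch).
    """
    order = sorted(range(len(x)), key=lambda i: (-abs(x[i]), i))
    return set(order[:k])


x = [0.0, -3.0, 0.5, 2.0, 0.0, -0.25]
print(support(x))                # indices of the nonzero entries
print(k_support(x, 2))           # indices of the two largest-magnitude entries
print(max(abs(v) for v in x))    # max |x|
print(min(abs(v) for v in x))    # min |x|
```

Note that for an exactly $k$-sparse vector, `k_support(x, k)` coincides with `support(x)`.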

III. Signal Model and Problem Description 

We consider sparse random signals with i.i.d. nonzero entries.
In other words, we assume that the unknown sparse signal
is an $n \times 1$ vector $x$ with exactly $k$ nonzero entries, where each
nonzero entry is independently sampled from a well-defined
distribution. The measurement matrix $A$ is an $m \times n$ matrix
with i.i.d. Gaussian entries with a ratio of dimensions $\delta = \frac{m}{n}$.
Compressed sensing theory guarantees that if $\mu = \frac{k}{n}$ is smaller
than a certain threshold, then every $k$-sparse signal can be
recovered using $\ell_1$ minimization. The relationship between $\delta$
and the maximum threshold of $\mu$ for which such a guarantee
exists is called the strong sparsity threshold, and is denoted
by $\mu_S(\delta)$. A more practical performance guarantee is the
so-called weak sparsity threshold, denoted by $\mu_W(\delta)$, which has
the following interpretation. For a fixed value of $\delta = \frac{m}{n}$ and an
i.i.d. Gaussian matrix $A$ of size $m \times n$, a random $k$-sparse
vector $x$ of size $n \times 1$ with a randomly chosen support set and
a random sign pattern can be recovered from $Ax$ using $\ell_1$
minimization with high probability, if $\frac{k}{n} < \mu_W(\delta)$. Similar
recovery thresholds can be obtained by imposing more or
fewer restrictions. For example, strong and weak thresholds for
nonnegative signals have been evaluated in [6].

We assume that the support size of $x$, namely $k$, is slightly
larger than the weak threshold of $\ell_1$ minimization. In other
words, $k = (1 + \epsilon_0)\mu_W(\delta)n$ for some $\epsilon_0 > 0$. This means that
if we use $\ell_1$ minimization, a randomly chosen $\mu_W(\delta)n$-sparse
signal will be recovered perfectly with very high probability,
whereas a randomly selected $k$-sparse signal will not. We
would like to show that for a strictly positive $\epsilon_0$, the iterative
reweighted $\ell_1$ algorithm of Section V can indeed recover a
randomly selected $k$-sparse signal with high probability, which
means that it has an improved weak threshold.

IV. The Scaling Law for the Compressive Sensing 
Stability 

In this section, we derive the scaling of the $\ell_1$ recovery
stability as a function of the signal sparsity. More specifically,
we are interested in characterizing a closed-form relationship
between the parameter $C$ appearing in the following theorem and
the sparsity $\frac{|K|}{n}$.

Theorem 1. Let $A$ be a general $m \times n$ measurement matrix,
$x$ be an $n$-element vector and $y = Ax$. Denote $K$ as a subset
of $\{1, 2, \dots, n\}$ such that its cardinality $|K| = k$ and further
denote $\bar{K} = \{1, 2, \dots, n\} \setminus K$. Let $w$ denote an $n \times 1$ vector.
Let $C > 1$ be a fixed number.

Given a specific set $K$, suppose that the part of $x$ on
$K$, namely $x_K$, is fixed. For all $x_{\bar{K}}$, any solution $\hat{x}$ produced by
$\ell_1$ minimization satisfies

$$\|x_K\|_1 - \|\hat{x}_K\|_1 \le \frac{2}{C-1}\|x_{\bar{K}}\|_1$$

and

$$\|(x - \hat{x})_{\bar{K}}\|_1 \le \frac{2C}{C-1}\|x_{\bar{K}}\|_1,$$

if and only if for all $w \in \mathbb{R}^n$ such that $Aw = 0$, we have

$$\|x_K + w_K\|_1 + \left\|\frac{w_{\bar{K}}}{C}\right\|_1 \ge \|x_K\|_1. \tag{3}$$

In fact, if (3) is satisfied, we will have the stability result

$$\|(x - \hat{x})_{\bar{K}}\|_1 \le \frac{2C}{C-1}\|x_{\bar{K}}\|_1.$$



In [9], it was established that when the matrix $A$ is sampled
from an i.i.d. Gaussian ensemble and $C = 1$, considering a single
index set $K$, there exists a constant ratio $0 < \mu_W < 1$ such that
if $\frac{|K|}{n} < \mu_W$, then with overwhelming probability as $n \to \infty$,
the condition (3) holds for all $w \in \mathbb{R}^n$ satisfying $Aw = 0$.
Now if we take a single index set $K$ with $\frac{|K|}{n} = (1 - \varpi)\mu_W$,
we would like to derive a characterization of $C$,
as a function of $\frac{|K|}{n} = (1 - \varpi)\mu_W$, such that the condition
(3) holds for all $w \in \mathbb{R}^n$ satisfying $Aw = 0$. The main result
of this paper is stated in the following theorem.

Theorem 2. Assume the $m \times n$ measurement matrix $A$ is
sampled from an i.i.d. Gaussian ensemble, and let $K$ be a single
index set with $\frac{|K|}{n} = (1 - \varpi)\mu_W$, where $\mu_W$ is the weak
threshold for ideally sparse signals and $\varpi$ is any real number
between 0 and 1. We also let $x$ be an $n$-dimensional signal
vector with $x_K$ being an arbitrary but fixed signal component.
Then with overwhelming probability, the condition (3) holds
for all $w \in \mathbb{R}^n$ satisfying $Aw = 0$, with respect to the
parameter $C = \frac{1}{\sqrt{1-\varpi}}$.

Proof: When the measurement matrix $A$ is sampled from
an i.i.d. Gaussian ensemble, it is known that the probability
that the condition (3) holds for all $w \in \mathbb{R}^n$ satisfying $Aw = 0$
is the Grassmann angle, namely the probability that an
$(n - m)$-dimensional uniformly distributed subspace intersects
a polyhedral cone trivially (intersecting only at the apex of
the cone). The complementary probability that the condition
(3) does not hold for all $w \in \mathbb{R}^n$ satisfying $Aw = 0$ is
the complementary Grassmann angle. In our problem, without
loss of generality, we scale $x_K$ (extended to an $n$-dimensional
vector supported on $K$) to a point in the relative interior of a
$(k - 1)$-dimensional face $F$ of the weighted $\ell_1$ ball,

$$\mathrm{SP} = \left\{y \in \mathbb{R}^n \;\middle|\; \|y_K\|_1 + \left\|\frac{y_{\bar{K}}}{C}\right\|_1 \le 1\right\}. \tag{4}$$

The polyhedral cone we are interested in for the complementary
Grassmann angle is the cone $\mathrm{SP} - x_K$, namely the cone
obtained by setting $x_K$ as the apex and observing SP from
this apex.

Building on the works by Santalo [11] and McMullen
[12] in high dimensional integral geometry and convex polytopes,
the complementary Grassmann angle for the $(k - 1)$-dimensional
face $F$ can be explicitly expressed as the sum of
products of internal angles and external angles [10]:

$$P = 2 \times \sum_{s \ge 0} \sum_{G \in \Im_{m+1+2s}(\mathrm{SP})} \beta(F, G)\,\gamma(G, \mathrm{SP}), \tag{5}$$

where $s$ is any nonnegative integer, $G$ is any $(m + 1 + 2s)$-dimensional
face of SP ($\Im_{m+1+2s}(\mathrm{SP})$ is the set of all such
faces), $\beta(\cdot,\cdot)$ stands for the internal angle and $\gamma(\cdot,\cdot)$ stands
for the external angle.

The internal angles and external angles are basically defined
as follows [10], [12]:

• An internal angle $\beta(F_1, F_2)$ is the fraction of the hypersphere
$S$ covered by the cone obtained by observing the
face $F_2$ from the face $F_1$.¹ The internal angle $\beta(F_1, F_2)$
is defined to be zero when $F_1 \nsubseteq F_2$ and is defined to be
one if $F_1 = F_2$.
• An external angle $\gamma(F_3, F_4)$ is the fraction of the hypersphere
$S$ covered by the cone of outward normals to
the hyperplanes supporting the face $F_4$ at the face $F_3$.
The external angle $\gamma(F_3, F_4)$ is defined to be zero when
$F_3 \nsubseteq F_4$ and is defined to be one if $F_3 = F_4$.
When $C = 1$, we denote the probability $P$ in (5) as $P_1$. By
definition, the weak threshold $\mu_W$ is the supremum of $\frac{|K|}{n}$
such that the probability $P_1$ in (5) goes to 0 as $n \to \infty$.
We need to show that for $\frac{|K|}{n} = (1 - \varpi)\mu_W$ and $C = \frac{1}{\sqrt{1-\varpi}}$,
$P$ in (5) also goes to 0 as $n \to \infty$. To that end, we only need to show
that the probability $P'$ that there exists a $w$ from the null space
of $A$ such that

$$\|x_K + w_K\|_1 + \left\|\frac{w_{K_1}}{C_\infty}\right\|_1 + \left\|\frac{w_{K_2}}{C}\right\|_1 < \|x_K\|_1 \tag{6}$$

goes to 0 as $n \to \infty$, where $C_\infty$ is a large number which we
may take as $\infty$ at the end, and $K_1$, $K_2$ and $K$ are disjoint sets
such that $|K \cup K_1| = \mu_W n$ and $K_1 \cup K_2 = \bar{K}$.

Then the probability $P'$ will be equal to the probability that
an $(n - m)$-dimensional uniformly distributed subspace intersects
the polyhedral cone $\mathrm{WSP} - x_K$ nontrivially (intersecting
at some other points besides the apex of the cone), where WSP
is the polytope

$$\mathrm{WSP} = \left\{y \in \mathbb{R}^n \;\middle|\; \|y_K\|_1 + \left\|\frac{y_{K_1}}{C_\infty}\right\|_1 + \left\|\frac{y_{K_2}}{C}\right\|_1 \le 1\right\}. \tag{7}$$

Then $P'$ is also a complementary Grassmann angle, which
can be expressed by [10]:

$$P' = 2 \times \sum_{s \ge 0} \sum_{G \in \Im_{m+1+2s}(\mathrm{WSP})} \beta(F, G)\,\gamma(G, \mathrm{WSP}). \tag{8}$$

Now we only need to show $P' \le P_1$. If we denote $l =
(m + 1 + 2s) + 1$ and $k = (1 - \varpi)\mu_W n$, then in the polytope WSP
there are in total $\binom{n-k}{l-k}2^{l-k}$ faces $G$ of dimension $(l - 1)$
such that $F \subseteq G$ and $\beta(F, G) \neq 0$.

However, we argue that when $C_\infty$ is very large, only
$\binom{n-k_1}{l-k_1}2^{l-k}$ such faces $G$ of dimension $(l - 1)$ will contribute
nonzero terms to $P'$ in (8), where $k_1 = \mu_W n$. In fact, a certain
$(l - 1)$-dimensional face $G$ supported on the index set $L$ is
the convex hull of $C_i e_i$, where $i \in L$, $C_i$ is the corresponding
weighting for index $i$ (which is 1 for the set $K$, $C_\infty$ for
the set $K_1$ and $C$ for the set $K_2$), and $e_i$ is the standard
unit coordinate vector. Now we show that if $K_1 \nsubseteq L$, the
corresponding term in (8) for the face $G$ will be 0 when $C_\infty$
is very large.

Lemma 1. Suppose that $F$ is a $(k - 1)$-dimensional face of
WSP supported on the subset $K$ with $|K| = k$. Then the
external angle $\gamma(G, \mathrm{WSP})$ between an $(l - 1)$-dimensional face
$G$ supported on the set $L$ ($F \subseteq G$) and the polytope WSP is
0 when $K_1 \nsubseteq L$ and $C_\infty$ is large.

¹Note that the dimension of the hypersphere $S$ here matches the dimension of
the corresponding cone discussed. Also, the center of the hypersphere is the
apex of the corresponding cone. All these defaults also apply to the definition
of the external angles.

Proof: Without loss of generality, assume $K = \{n - k + 1, \dots, n\}$.
Consider the $(l - 1)$-dimensional face

$$G = \mathrm{conv}\{C_{n-l+1} \times e_{n-l+1}, \dots, C_{n-k} \times e_{n-k}, e_{n-k+1}, \dots, e_n\}$$

of WSP. The $2^{n-l}$ outward normal vectors of the supporting
hyperplanes of the facets containing $G$ are given by

$$\left\{\sum_{p=1}^{n-l} j_p e_p / C_p + \sum_{p=n-l+1}^{n-k} e_p / C_p + \sum_{p=n-k+1}^{n} e_p, \; j_p \in \{-1, 1\}\right\}.$$

Then the outward normal cone $c(G, \mathrm{WSP})$ at the face $G$ is
the positive hull of these normal vectors. When $K_1 \nsubseteq L$, the
fraction of the surface of the $(n - l - 1)$-dimensional sphere
taken by the cone $c(G, \mathrm{WSP})$ is 0 since the corresponding $C_p$
is very large. ■

Now let us look at the internal angle $\beta(F, G)$ between the
$(k - 1)$-dimensional face $F$ and an $(l - 1)$-dimensional face $G$,
where $K_1$ is a subset of the support set of $G$. Notice that the
only interesting case is when $F \subseteq G$ since $\beta(F, G) \neq 0$ only
if $F \subseteq G$. We will see that if $F \subseteq G$, the cone $c(F, G)$ formed by
observing $G$ from $F$ is the direct sum of a $(k - 1)$-dimensional
linear subspace and the positive hull of $(l - k)$ vectors. These
$(l - k)$ vectors are in the form

$$v_i = \left(-\tfrac{1}{k}, \dots, -\tfrac{1}{k}, 0, \dots, C_i, 0, \dots, 0\right), \quad i \in L \setminus K.$$

For those vectors $v_i$ with $i \in K_1$, $C_i = C_\infty$. When $C_\infty$ is
very large, the considered cone takes half of the space at each
$i$-th coordinate with $i \in K_1$.

So by the definition of the internal angle, the internal angle
$\beta(F, G)$ is equal to $2^{-(k_1-k)} \times \beta(F, G_1)$, where $G_1$ is supported
only on the set $L \setminus K_1$. It is known that this internal angle
$\beta(F, G_1)$ is equal to the fraction of an $(l - k_1 - 1)$-dimensional
sphere taken by a polyhedral cone formed by $(l - k_1)$ unit
vectors with inner product $\frac{1}{1+C^2k}$ between each other. In this
case, the internal angle is given by



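$$\beta(F, G) = \frac{1}{2^{k_1-k}} \cdot \frac{V_{l-k_1-1}\left(\frac{1}{1+C^2k},\; l-k_1-1\right)}{V_{l-k_1-1}\left(S^{l-k_1-1}\right)}, \tag{9}$$

where $V_i(S^i)$ denotes the $i$-th dimensional surface measure on
the unit sphere $S^i$, while $V_i(\alpha', i)$ denotes the surface measure
for a regular spherical simplex with $(i + 1)$ vertices on the unit
sphere $S^i$ and with inner product $\alpha'$ between these $(i + 1)$
vertices. Thus the ratio in (9) is equal to $B\left(\frac{1}{1+C^2k},\, l - k_1\right)$, where

$$B(\alpha', m') = \theta^{\frac{m'-1}{2}} \sqrt{(m' - 1)\alpha' + 1}\; \pi^{-m'/2}\, \alpha'^{-1/2}\, J(m', \theta), \tag{10}$$

with $\theta = (1 - \alpha')/\alpha'$ and

$$J(m', \theta) = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} \left(\int_0^{\infty} e^{-\theta v^2 + 2iv\lambda}\, dv\right)^{m'} e^{-\lambda^2}\, d\lambda. \tag{11}$$

If we take $C = \frac{1}{\sqrt{1-\varpi}}$, then

$$\frac{1}{1 + C^2 k} = \frac{1}{1 + k_1}.$$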

By comparison, $\beta(F, G) = 2^{-(k_1-k)} \times \beta(F, G_1)$ is exactly
the $2^{-(k_1-k)}\beta(F_1, G_1)$ term appearing in the expression for the
Grassmann angle $P$ between the face $F_1$ supported on the set
$K_1$ and the polytope SP, where $G_1$ is an $(l - 1)$-dimensional
face of SP supported on the set $L$.

Similar to the derivation for the internal angle, we can show
that the external angle $\gamma(G, \mathrm{WSP})$ is also exactly equal to the
$\gamma(G_1, \mathrm{SP})$ term appearing in the expression for the Grassmann
angle $P$ between the face $F_1$ supported on the set $K_1$ and the
polytope SP, where $G_1$ is an $(l - 1)$-dimensional face of SP
supported on the set $L$.

Since in total only $\binom{n-k_1}{l-k_1}2^{l-k}$ such faces $G$ of
dimension $(l - 1)$ contribute nonzero terms to $P'$ in (8),
substituting the results for the internal and external angles, we
have $P = P'$. Thus for $\frac{|K|}{n} = (1 - \varpi)\mu_W$ and $C = \frac{1}{\sqrt{1-\varpi}}$,
with high probability, the condition (3) holds for
all $w \in \mathbb{R}^n$ satisfying $Aw = 0$.

■ 
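
To get a feel for the scaling law numerically, the following minimal Python sketch evaluates the stability parameter and the resulting error-bound constant for a few back-off values. It assumes the closed form $C = 1/\sqrt{1-\varpi}$ used in the proof above and the constant $2C/(C-1)$ from the stability result of Theorem 1; the function names `scaling_law_C` and `stability_constant` are illustrative and do not come from the paper.

```python
import math


def scaling_law_C(varpi):
    """Stability parameter C = 1 / sqrt(1 - varpi) for a back-off 0 < varpi < 1."""
    if not 0.0 < varpi < 1.0:
        raise ValueError("varpi must lie strictly between 0 and 1")
    return 1.0 / math.sqrt(1.0 - varpi)


def stability_constant(C):
    """Constant 2C / (C - 1) multiplying the l1 norm of the off-support part."""
    if C <= 1.0:
        raise ValueError("C must be strictly larger than 1")
    return 2.0 * C / (C - 1.0)


# A smaller back-off from the weak threshold gives C closer to 1
# and therefore a larger (weaker) error-bound constant.
for varpi in (0.75, 0.5, 0.1, 0.01):
    C = scaling_law_C(varpi)
    print(f"varpi={varpi:<5} C={C:.4f} 2C/(C-1)={stability_constant(C):.2f}")
```

For small $\varpi$ one has $C \approx 1 + \varpi/2$, so under these assumptions the constant grows roughly like $4/\varpi$ as the sparsity approaches the weak threshold.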

V. Iterative Weighted $\ell_1$ Algorithm

Beginning from this section, we will see how the stability
result is used in analyzing the iterative reweighted $\ell_1$
minimization algorithms. We focus on the following algorithm
from [16], [17], consisting of two $\ell_1$ minimization steps: a
standard one and a weighted one. The input to the algorithm
is the vector $y = Ax$, where $x$ is a $k$-sparse signal with
$k = (1 + \epsilon_0)\mu_W(\delta)n$, and the output is an approximation $x^*$
to the unknown vector $x$. We assume that $k$, or an upper bound
on it, is known. Also $\omega > 1$ is a predetermined weight.

Algorithm 1. [17]

1) Solve the $\ell_1$ minimization problem:

$$\hat{x} = \arg\min \|z\|_1 \text{ subject to } Az = Ax. \tag{12}$$

2) Obtain an approximation for the support set of $x$: find
the index set $L \subseteq \{1, 2, \dots, n\}$ which corresponds to the
largest $k$ elements of $\hat{x}$ in magnitude.

3) Solve the following weighted $\ell_1$ minimization problem
and declare the solution as output:

$$x^* = \arg\min \|z_L\|_1 + \omega\|z_{\bar{L}}\|_1 \text{ subject to } Az = Ax. \tag{13}$$
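
The control flow of the three steps can be sketched in Python as follows. This is not the authors' implementation: the weighted $\ell_1$ solver is passed in as a callback `solve_weighted_l1(A, y, weights)`, a hypothetical interface assumed here that should return a minimizer of $\sum_i \text{weights}_i |z_i|$ subject to $Az = y$. The `single_row_solver` below is a toy stand-in that is only valid when $A$ has a single row with nonzero entries; in practice a linear programming solver would take its place.

```python
def two_step_reweighted_l1(A, y, k, omega, solve_weighted_l1):
    """Sketch of the two-step reweighted l1 procedure (0-based indices)."""
    n = len(A[0])
    # Step 1: standard l1 minimization (all weights equal to 1).
    x_hat = solve_weighted_l1(A, y, [1.0] * n)
    # Step 2: L = indices of the k largest entries of x_hat in magnitude.
    order = sorted(range(n), key=lambda i: (-abs(x_hat[i]), i))
    L = set(order[:k])
    # Step 3: weight 1 on L, weight omega > 1 on the complement of L.
    weights = [1.0 if i in L else omega for i in range(n)]
    x_star = solve_weighted_l1(A, y, weights)
    return x_star, L


def single_row_solver(A, y, weights):
    """Toy exact solver, valid only for a single measurement row a.z = y.

    With one equation, the weighted l1 minimizer puts all mass on the
    coordinate maximizing |a_i| / weights_i.
    """
    a = A[0]
    best = max(range(len(a)), key=lambda i: abs(a[i]) / weights[i])
    z = [0.0] * len(a)
    z[best] = y[0] / a[best]
    return z


A = [[1.0, 2.0, 1.5]]
y = [3.0]
x_star, L = two_step_reweighted_l1(A, y, k=1, omega=2.0,
                                   solve_weighted_l1=single_row_solver)
print(x_star, L)
```

Passing the solver as a parameter keeps the sketch independent of any particular optimization library, which is a design choice of this example rather than something prescribed by the algorithm.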

The idea behind the algorithm is as follows. In the first
step we perform a standard $\ell_1$ minimization. If the sparsity
of the signal is beyond the weak threshold $\mu_W(\delta)n$, then $\ell_1$
minimization is not capable of recovering the signal. However,
we can use its output to identify an index set $L$ in which
most elements correspond to the nonzero elements of $x$. We
finally perform a weighted $\ell_1$ minimization by penalizing
those entries of $x$ that are not in $L$ because they have a lower
chance of being nonzero elements.

In the next sections we formally prove that, for certain
classes of signals, Algorithm 1 has a recovery threshold
beyond that of standard $\ell_1$ minimization. The idea of the
proof is as follows. In Section VI we prove that there is a
large overlap between the index set $L$, found in Step 2 of
the algorithm, and the support set of the unknown signal $x$
(denoted by $K$); see Theorem 3. Then in Section VII we
show that the large overlap between $K$ and $L$ can result in
perfect recovery of $x$, beyond the standard weak threshold,
when a weighted $\ell_1$ minimization is used in Step 3.

This proof idea was already used in [17] to prove a threshold
improvement in recovering sparse vectors with Gaussian
distributed nonzero elements by using a numerical evaluation 
of the robustness 

VI. Approximate Support Recovery, Steps 1 and 2 
of the Algorithm 

In this section, we carefully study the first two steps of
Algorithm 1. The unknown signal $x$ is assumed to be a
$k$-sparse vector with support set $K$, where $k = |K| =
(1 + \epsilon_0)\mu_W(\delta)n$, for some $\epsilon_0 > 0$. The set $L$, as defined in the
algorithm, is in fact the $k$-support set of $\hat{x}$. We show that for
small enough $\epsilon_0$, the intersection of $L$ and $K$ is very large
with high probability, so that $L$ can be counted as a good
approximation to $K$.

We now lower bound $|L \cap K|$. First, we state a general
lemma that bounds $|K \cap L|$ as a function of $\|x - \hat{x}\|_1$ [17].
Then, we recall an intrinsic property of $\ell_1$ minimization called
weak robustness that provides an upper bound on the quantity
$\|x - \hat{x}\|_1$.

Definition 1. [17] For a $k$-sparse signal $x$, we define $W(x, \lambda)$
to be the size of the largest subset of nonzero entries of $x$ that
has an $\ell_1$ norm less than or equal to $\lambda$.

$$W(x, \lambda) := \max\{|S| \mid S \subseteq \mathrm{supp}(x),\ \|x_S\|_1 \le \lambda\}$$

Note that $W(x, \lambda)$ is increasing in $\lambda$.
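
Since the largest admissible subset is obtained by taking the smallest nonzero magnitudes first, $W(x, \lambda)$ can be computed greedily. The short Python sketch below illustrates this; the function name `W` mirrors the notation of Definition 1, and the greedy strategy is an implementation choice of this example.

```python
def W(x, lam):
    """Size of the largest subset of nonzero entries of x with l1 norm <= lam."""
    magnitudes = sorted(abs(v) for v in x if v != 0)
    total = 0.0
    count = 0
    for m in magnitudes:
        # Adding the next-smallest entry is optimal; stop once the budget is exceeded.
        if total + m > lam:
            break
        total += m
        count += 1
    return count


x = [0.0, -3.0, 0.5, 2.0, 0.0, -0.25]
for lam in (0.2, 0.75, 2.0, 2.75, 100.0):
    print(lam, W(x, lam))
```

The printed counts are nondecreasing in `lam`, in line with the monotonicity noted above.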

Lemma 2. [17] Let $x$ be a $k$-sparse vector and $\hat{x}$ be another
vector. Also, let $K$ be the support set of $x$ and $L$ be the
$k$-support set of $\hat{x}$. Then

$$|K \cap L| \ge k - W(x, \|x - \hat{x}\|_1). \tag{14}$$
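
The bound of Lemma 2 is deterministic, so it can be sanity-checked on random instances. The Python sketch below does this with the standard library only; the helper names, the problem sizes, and the Gaussian perturbation used to produce $\hat{x}$ are all assumptions of this example (in the paper, $\hat{x}$ would be the output of $\ell_1$ minimization).

```python
import random


def k_support(v, k):
    """Indices of the k largest entries of v in magnitude (ties: lower index)."""
    order = sorted(range(len(v)), key=lambda i: (-abs(v[i]), i))
    return set(order[:k])


def W(x, lam):
    """Size of the largest subset of nonzero entries of x with l1 norm <= lam."""
    total, count = 0.0, 0
    for m in sorted(abs(v) for v in x if v != 0):
        if total + m > lam:
            break
        total += m
        count += 1
    return count


def lemma2_holds(x, x_hat):
    """Check |K n L| >= k - W(x, ||x - x_hat||_1) for one pair of vectors."""
    K = {i for i, v in enumerate(x) if v != 0}
    k = len(K)
    L = k_support(x_hat, k)
    err = sum(abs(a - b) for a, b in zip(x, x_hat))
    return len(K & L) >= k - W(x, err)


rng = random.Random(0)
n, k = 50, 10
all_ok = True
for _ in range(1000):
    x = [0.0] * n
    for i in rng.sample(range(n), k):
        x[i] = rng.gauss(0.0, 1.0)
    x_hat = [v + rng.gauss(0.0, 0.5) for v in x]  # arbitrary perturbed estimate
    all_ok = all_ok and lemma2_holds(x, x_hat)
print(all_ok)
```

Because the lemma holds for every pair of vectors, the check is expected to succeed on all trials regardless of how $\hat{x}$ is generated.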



We now review the notion of weak robustness, which allows
us to bound $\|x - \hat{x}\|_1$, and has the following formal definition.

Definition 2. Let the set $S \subseteq \{1, 2, \dots, n\}$ and the subvector
$x_S$ be fixed. A solution $\hat{x}$ is called weakly robust if for some
$C > 1$ called the robustness factor, and all $x_{\bar{S}}$, it holds that

$$\|(x - \hat{x})_{\bar{S}}\|_1 \le \frac{2C}{C-1}\|x_{\bar{S}}\|_1. \tag{15}$$

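The weak robustness notion allows us to bound the error
$\|x - \hat{x}\|_1$ in the following way. If the matrix $A_S$, obtained
by retaining only those columns of $A$ that are indexed by $S$,
has full column rank, then the quantity

$$\kappa = \max_{Aw = 0,\ w \neq 0} \frac{\|w_S\|_1}{\|w_{\bar{S}}\|_1}$$

must be finite, and one can write

$$\|x - \hat{x}\|_1 \le \frac{2C(1 + \kappa)}{C - 1}\|x_{\bar{S}}\|_1. \tag{16}$$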

From [9] and the scaling law discovered in this paper, we
know that for Gaussian i.i.d. measurement matrices $A$, $\ell_1$
minimization is weakly robust, i.e., there exists a robustness
factor $C > 1$ as a function of $\frac{|S|}{n} < \mu_W(\delta)$ for which (15)
holds. Now let $k_1 = (1 - \epsilon_1)\mu_W(\delta)n$ for some small $\epsilon_1 > 0$,
and $K_1$ be the $k_1$-support set of $x$, namely, the set of the
largest $k_1$ entries of $x$ in magnitude. Based on equation (16)
we may write

$$\|x - \hat{x}\|_1 \le \frac{2C(1 + \kappa)}{C - 1}\|x_{\bar{K}_1}\|_1. \tag{17}$$

For a fixed value of $\delta$, $C$ in (17) is a function of $\epsilon_1$ following
the scaling law discovered in this paper, and becomes arbitrarily
close to 1 as $\epsilon_1 \to 0$. $\kappa$ is also a bounded function of
$\epsilon_1$ and therefore we may replace it with an upper bound $\kappa^*$.
We now have a bound on $\|x - \hat{x}\|_1$. To explore this inequality
and understand its asymptotic behavior, we apply a third result,
which is a certain concentration bound on the order statistics of
random variables following certain amplitude distributions.

Lemma 3. Suppose $X_1, X_2, \dots, X_N$ are $N$ i.i.d. random
variables whose amplitudes, with a mean value of $E(|X|)$,
follow the probability density function $f(x)$ for $x \ge 0$. Let
$S_N = \sum_{i=1}^{N} |X_i|$ and let $S_M$ be the sum of the smallest $M$
numbers among the $|X_i|$, for each $1 \le M \le N$. Then for
every $\epsilon > 0$, as $N \to \infty$, we have

$$P\left(\left|\frac{S_N}{N} - E(|X|)\right| > \epsilon\right) \to 0,$$

$$P\left(\left|\frac{S_M}{S_N} - \frac{\int_0^{F^{-1}(M/N)} x f(x)\,dx}{E(|X|)}\right| > \epsilon\right) \to 0,$$

where $F(x)$ is the corresponding cumulative distribution function
for the considered random variable amplitude $|X|$.
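
The concentration of the sum of the smallest amplitudes can be illustrated with a small simulation. The Python sketch below uses unit-mean exponential amplitudes, a distribution chosen here purely for convenience: with $f(x) = e^{-x}$ and $p = M/N$, one has $F^{-1}(p) = -\ln(1-p)$ and $\int_0^{F^{-1}(p)} x e^{-x}\,dx = p + (1-p)\ln(1-p)$. The function names, the seed, and the sample size are assumptions of this example.

```python
import math
import random


def smallest_fraction(samples, M):
    """Empirical S_M / S_N: share of the l1 mass carried by the M smallest amplitudes."""
    magnitudes = sorted(abs(v) for v in samples)
    return sum(magnitudes[:M]) / sum(magnitudes)


def exponential_limit(p):
    """Limit of S_M / S_N for unit-mean exponential amplitudes with M/N -> p."""
    return p + (1.0 - p) * math.log(1.0 - p)


rng = random.Random(1)
N = 200_000
samples = [rng.expovariate(1.0) for _ in range(N)]
results = []
for p in (0.1, 0.5, 0.9):
    M = int(p * N)
    empirical = smallest_fraction(samples, M)
    results.append((p, empirical, exponential_limit(p)))
    print(f"p={p}: empirical={empirical:.4f} limit={exponential_limit(p):.4f}")
```

For a distribution with less mass near the origin, the same fraction $p$ of smallest entries carries a larger share of the $\ell_1$ norm, which is the mechanism the later support recovery argument relies on.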

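Without loss of generality, we assume $E(|X|) = 1$. As a direct
consequence of Lemma 3 we can write:

$$P\left(\left|\frac{\|x_{\bar{K}_1}\|_1}{\|x\|_1} - \int_0^{F^{-1}\left(\frac{\epsilon_0 + \epsilon_1}{1 + \epsilon_0}\right)} x f(x)\,dx\right| > \epsilon\right) \to 0 \tag{18}$$

for all $\epsilon > 0$ as $n \to \infty$. Define

$$\zeta(\epsilon_0) := \inf_{\epsilon_1 > 0} \frac{2C(1 + \kappa^*)}{C - 1} \int_0^{F^{-1}\left(\frac{\epsilon_0 + \epsilon_1}{1 + \epsilon_0}\right)} x f(x)\,dx.$$

Combining (17) with (18) we can get

$$P\left(\frac{\|x - \hat{x}\|_1}{\|x\|_1} - \zeta(\epsilon_0) < \epsilon\right) \to 1 \tag{19}$$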

for all $\epsilon > 0$ as $n \to \infty$. In summary, we have shown that
$|K \cap L| \ge k - W(x, \|x - \hat{x}\|_1)$, and then "weak robustness"
of $\ell_1$ minimization guarantees that for large $n$ with high
probability $\|x - \hat{x}\|_1 \le \zeta(\epsilon_0)\|x\|_1$. These results will further
lead to the main claim on the support recovery, which extends
a similar claim in [17] by using the closed-form scaling law
result in this paper.



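Theorem 3 (Support Recovery). Let $A$ be an i.i.d. Gaussian
$m \times n$ measurement matrix with $\frac{m}{n} = \delta$. Let $k = (1 + \epsilon_0)\mu_W(\delta)n$
and $x$ be an $n \times 1$ random $k$-sparse vector whose nonzero
element amplitude follows the distribution of $f(x)$. Suppose
that $\hat{x}$ is the approximation to $x$ given by $\ell_1$ minimization,
namely $\hat{x} = \arg\min_{Az = Ax}\|z\|_1$. Then, for any $\epsilon_0 > 0$ and
for all $\epsilon > 0$, as $n \to \infty$,

$$P\left(\frac{|\mathrm{supp}(x) \cap \mathrm{supp}_k(\hat{x})|}{k} - \left(1 - F(y^*)\right) > -\epsilon\right) \to 1, \tag{20}$$

where $y^*$ is the solution to $y$ in the equation $\int_0^{y} x f(x)\,dx =
\zeta(\epsilon_0)$.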

Moreover, if the integer $t \ge 0$ is the smallest integer for
which the amplitude distribution $f(x)$ has a nonzero $t$-th order
derivative at the origin, namely $f^{(t)}(0) \neq 0$, then as $\epsilon_0 \to 0$, with
high probability,

$$\frac{|\mathrm{supp}(x) \cap \mathrm{supp}_k(\hat{x})|}{k} = 1 - O\left(\epsilon_0^{\frac{1}{t+2}}\right). \tag{21}$$

The proof of Theorem 3 relies on the scaling law for
recovery stability in this paper and the concentration Lemma 3.
Note that if $\epsilon_0 \to 0$, then Theorem 3 implies that $\frac{|K \cap L|}{k}$
becomes arbitrarily close to 1. We can also see that the support
recovery is better when the probability density function
$f(x)$ has a lower order of nonzero derivative. This is
consistent with the better recovery performance observed for
such distributions in simulations of the iterative reweighted $\ell_1$
minimization algorithms.

VII. Perfect Recovery, Step 3 of the Algorithm 

In Section VI we showed that, if $\epsilon_0$ is small, the $k$-support
of $\hat{x}$, namely $L = \mathrm{supp}_k(\hat{x})$, has a significant overlap with
the true support of $x$. The scaling law gives a quantitative
lower bound on the size of this overlap in Theorem 3. In Step
3 of Algorithm 1, weighted $\ell_1$ minimization is used, where
the entries in $\bar{L}$ are assigned a higher weight than those in
$L$. In [8], we have been able to analyze the performance of
such weighted $\ell_1$ minimization algorithms. The idea is that if
a sparse vector $x$ can be partitioned into two sets $L$ and $\bar{L}$,
where in one set the fraction of non-zeros is much larger than
in the other set, then (13) can potentially increase the recovery
threshold of $\ell_1$ minimization.

Theorem 4. [8] Let $L \subseteq \{1, 2, \dots, n\}$, $\omega > 1$ and the
fractions $f_1, f_2 \in [0, 1]$ be given. Let $\gamma_1 = \frac{|L|}{n}$ and $\gamma_2 = 1 - \gamma_1$.
There exists a threshold $\delta_c(\gamma_1, \gamma_2, f_1, f_2, \omega)$ such that with
high probability, almost all random sparse vectors $x$ with at
least $f_1\gamma_1 n$ nonzero entries over the set $L$, and at most $f_2\gamma_2 n$
nonzero entries over the set $\bar{L}$ can be perfectly recovered using
$\min_{Az = Ax} \|z_L\|_1 + \omega\|z_{\bar{L}}\|_1$, where $A$ is a $\delta_c n \times n$ matrix with
i.i.d. Gaussian entries. Furthermore, for appropriate $\omega$,

$$\mu_W(\delta_c(\gamma_1, \gamma_2, f_1, f_2, \omega)) < f_1\gamma_1 + f_2\gamma_2,$$

i.e., standard $\ell_1$ minimization using a $\delta_c n \times n$ measurement
matrix with i.i.d. Gaussian entries cannot recover such $x$.



A software package for computing such thresholds can
also be found in [19]. We then summarize the threshold
improvement result in the following theorem, with the detailed
proofs omitted due to limited space.

Theorem 5 (Perfect Recovery). Let $A$ be an $m \times n$ i.i.d. Gaussian
matrix with $\frac{m}{n} = \delta$. If $\delta_c(\mu_W(\delta), 1 - \mu_W(\delta), 1, 0, \omega) < \delta$,
then there exist $\epsilon_0 > 0$ and $\omega > 0$ such that, with high
probability as $n$ grows to infinity, Algorithm 1 perfectly
recovers a random $(1 + \epsilon_0)\mu_W(\delta)n$-sparse vector with i.i.d.
nonzero entries following an amplitude distribution whose pdf
has a nonzero derivative of some finite order at the origin.